Improving Robustness of Speaker Verification Against Mimicked Speech

Authors

  • Kuruvachan K. George
  • C. Santhosh Kumar
  • K. I. Ramachandran
  • Ashish Panda
  • Amrita Vishwa Vidyapeetham
Abstract

Making speaker verification (SV) systems robust to spoofed or mimicked speech attacks is essential for their effective use in security applications. In this work, we show that a proximal support vector machine backend classifier with i-vectors as inputs (i-PSVM) can improve the performance of SV systems when mimicked speech is used as non-target trials. We compared our results with the state-of-the-art baseline systems: i-vector with cosine distance scoring (i-CDS), i-vector with a backend SVM classifier (i-SVM), and cosine distance features with an SVM backend classifier (CDF-SVM). In i-PSVM, the proximity of the test utterance to the target and non-target classes is the criterion for decision making, whereas in i-SVM the criterion is the distance from the separating hyperplane. The i-PSVM approach proved advantageous when tested with mimicked speech as non-target trials, which suggests that proximity to the target speakers is a better criterion for speaker verification against mimicked speech. Further, we note that weighting the target and non-target class examples helps fine-tune the performance of i-PSVM. We then devised a strategy for estimating the weight of every example based on its cosine distance similarity with respect to the centroid of the target class examples. The final i-PSVM with the example-based weighting scheme achieved an absolute improvement of 3.39% in EER over the best baseline system, i-SVM. Subsequently, we fused the i-PSVM and i-SVM systems, and the results show that the combined system performs better than the individual systems.
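The abstract does not give the formulation, so the following is only a minimal sketch of how an example-weighted proximal SVM backend could look. It assumes the standard closed-form linear PSVM (a regularized least-squares fit of the two proximal planes) and a hypothetical weight mapping, nu * (1 + cos) / 2, from each example's cosine similarity to the target-class centroid. The function names, the mapping, and the synthetic vectors standing in for i-vectors are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def cosine_similarity(x, y):
    """Cosine similarity between two vectors."""
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))


def example_weights(A, d, nu=1.0):
    """Per-example weights from cosine similarity to the target centroid.

    ASSUMPTION: the mapping nu * (1 + cos) / 2 is hypothetical; the
    abstract only says weights depend on this similarity.
    """
    centroid = A[d == 1].mean(axis=0)
    sims = np.array([cosine_similarity(a, centroid) for a in A])
    return nu * (1.0 + sims) / 2.0


def fit_weighted_psvm(A, d, c):
    """Solve min_z 0.5 * sum_i c_i * (d_i - E_i z)^2 + 0.5 * ||z||^2.

    E = [A, -1] and z = [w; gamma]. With all c_i equal to nu this is
    the usual linear PSVM solution (I / nu + E^T E) z = E^T d.
    """
    n, dim = A.shape
    E = np.hstack([A, -np.ones((n, 1))])
    lhs = np.eye(dim + 1) + E.T @ (c[:, None] * E)
    rhs = E.T @ (c * d)
    z = np.linalg.solve(lhs, rhs)
    return z[:-1], z[-1]


def psvm_score(x, w, gamma):
    """Signed score: positive means closer to the target plane."""
    return float(x @ w - gamma)


# Synthetic stand-ins for i-vectors (illustrative data only).
rng = np.random.default_rng(0)
target = rng.normal(loc=1.0, scale=0.5, size=(20, 4))
nontarget = rng.normal(loc=-1.0, scale=0.5, size=(40, 4))
A = np.vstack([target, nontarget])
d = np.concatenate([np.ones(20), -np.ones(40)])

c = example_weights(A, d, nu=1.0)
w, gamma = fit_weighted_psvm(A, d, c)
print(psvm_score(np.ones(4), w, gamma), psvm_score(-np.ones(4), w, gamma))
```

Under this hypothetical mapping, non-target examples that resemble the target centroid receive larger weights, so the fit pays more attention to hard impostors; other mappings are equally plausible given what the abstract states.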

Similar resources

"STC spoofing" database for text-dependent speaker recognition evaluation

The paper describes the “STC Spoofing” database, which consists of a set of recordings of “live” speech by several speakers, as well as synthesized speech fragments obtained using a TTS engine based on these speakers’ voices. The database can be used for testing the robustness of text-dependent speaker verification systems against spoofing attacks, as well as for research and development of meth...

Robust speaker recognition using microphone arrays

This paper investigates the use of microphone arrays in hands-free speaker recognition systems. Hands-free operation is preferable in many potential speaker recognition applications; however, obtaining acceptable performance with a single distant microphone is problematic in real noise conditions. A possible solution to this problem is the use of microphone arrays, which have the capacity to enha...

Noise robust speaker verification with delta cepstrum normalization

This paper introduces a delta cepstrum normalization (DCN) technique for speaker verification under noisy conditions. Cepstral feature normalization techniques are widely used to mitigate spectral variations caused by various types of noise; however, little attention has been paid to normalizing delta features. A DCN technique that normalizes not only base features but also delta features was r...

Peter Balazs Speaker Verification using Pole / Zero Estimates of Nasals

The acoustics of nasals are an important source of speaker-discriminating features. Nasal spectra contain poles and zeros dependent upon nasal cavities, which are complex static structures that vary from person to person. Nasal spectra may therefore have low within-speaker and high between-speaker variability. This study applies a recent pole-zero model estimation technique based on a logarithmic...

Imposture using synthetic speech against speaker verification based on spectrum and pitch

This paper describes the security of speaker verification systems against imposture using synthetic speech. We propose a text-prompted speaker verification technique which utilizes pitch information in addition to spectral information, and investigate whether synthetic speech is rejected. Experimental results show that pitch information is not necessarily useful for rejection of synthetic speech, a...

Publication date: 2016